{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "zGBzUG6ThEU_"
   },
   "source": [
    "# Test Generators\n",
    "To create a list of tests, you can use a list comprehension.\n",
    "As an example, we create tests for several quantiles of a single column.\n",
    "\n",
    "First, we prepare the reference and current data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.datasets import fetch_openml\n",
    "\n",
    "\n",
    "# load the Adult dataset from OpenML\n",
    "data = fetch_openml(name='adult', version=2, as_frame='auto')\n",
    "# use the education level as a synthetic target and add Gaussian noise to simulate predictions\n",
    "data.frame['target'] = data.frame['education-num']\n",
    "data.frame['preds'] = data.frame['education-num'].values + np.random.normal(0, 6, data.frame.shape[0])\n",
    "# the first 20000 rows are the reference data, the remaining rows are the current data\n",
    "reference_data = data.frame[:20000]\n",
    "current_data = data.frame[20000:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can import the quantile test and the test suite class from **evidently** and add a list of tests with different quantiles to our suite:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from evidently.tests import TestColumnQuantile\n",
    "from evidently.test_suite import TestSuite\n",
    "\n",
    "\n",
    "suite = TestSuite(tests=[\n",
    "    TestColumnQuantile(column_name=\"education-num\", quantile=quantile) for quantile in [0.5, 0.9, 0.99]\n",
    "])\n",
    "\n",
    "suite.run(current_data=current_data, reference_data=reference_data)\n",
    "suite"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can build the same kind of list for a specific set of columns:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from evidently.tests import TestColumnValueMin\n",
    "\n",
    "\n",
    "# test that values in specified columns are positive (greater than zero condition)\n",
    "suite = TestSuite(tests=[\n",
    "    TestColumnValueMin(column_name=column_name, gt=0) for column_name in [\"age\", \"fnlwgt\", \"education-num\"]\n",
    "])\n",
    "\n",
    "suite.run(current_data=current_data, reference_data=reference_data)\n",
    "suite"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, if you want to use column names when generating tests, sometimes you cannot get them before the suite calculation is launched.\n",
    "\n",
    "In this case, you can use test generators:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from evidently.tests.base_test import generate_column_tests\n",
    "\n",
    "from evidently.tests import TestColumnShareOfMissingValues\n",
    "\n",
    "suite = TestSuite(tests=[generate_column_tests(TestColumnShareOfMissingValues)])\n",
    "suite.run(current_data=current_data, reference_data=reference_data)\n",
    "suite.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By default, the generator creates a test with no parameters for every column.\n",
    "To set custom parameters for all generated tests, pass them in the `parameters` argument:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set condition for all generated tests and specify that we want to generate tests for all columns explicitly\n",
    "suite = TestSuite(tests=[generate_column_tests(TestColumnShareOfMissingValues, columns=\"all\", parameters={\"lt\": 0.5})])\n",
    "suite.run(current_data=current_data, reference_data=reference_data)\n",
    "suite"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To generate tests for numerical features only, set the `columns` parameter to `\"num\"`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "suite = TestSuite(tests=[generate_column_tests(TestColumnValueMin, columns=\"num\")])\n",
    "suite.run(current_data=current_data, reference_data=reference_data)\n",
    "suite"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To generate tests for categorical features only, set the `columns` parameter to `\"cat\"`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "suite = TestSuite(tests=[generate_column_tests(TestColumnShareOfMissingValues, columns=\"cat\", parameters={\"lt\": 0.1})])\n",
    "suite.run(current_data=current_data, reference_data=reference_data)\n",
    "suite"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also pass the generator an explicit list of columns:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "suite = TestSuite(tests=[generate_column_tests(TestColumnValueMin, columns=[\"age\", \"fnlwgt\", \"education-num\"], parameters={\"gt\": 0})])\n",
    "suite.run(current_data=current_data, reference_data=reference_data)\n",
    "suite"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "# Metric Generators\n",
    "Metric generators work like test generators: you can use a list comprehension or the `generate_column_metrics` helper function:"
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from evidently.metrics import ColumnValueRangeMetric\n",
    "from evidently.metrics.base_metric import generate_column_metrics\n",
    "from evidently.report import Report\n",
    "\n",
    "report = Report(\n",
    "    metrics=[\n",
    "        generate_column_metrics(\n",
    "            ColumnValueRangeMetric,\n",
    "            columns=\"num\",\n",
    "            parameters={\"left\": 0, \"right\": 10}\n",
    "        )\n",
    "    ]\n",
    ")\n",
    "report.run(current_data=current_data, reference_data=reference_data)\n",
    "report"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "example_tests.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
